
Fix IndirectAttackEvaluator results not displaying in AI Foundry portal #45721

Draft
Copilot wants to merge 2 commits into main from copilot/fix-indirect-attack-evaluator-results

Conversation

Contributor

Copilot AI commented Mar 16, 2026

IndirectAttackEvaluator evaluations showed "Completed" in Foundry but no scores — results were only accessible programmatically. The bug: EvaluationMetrics.XPIA.value was changed to "indirect_attack" for service API communication, causing output columns to be named indirect_attack_label, indirect_attack_reason, etc. AI Foundry and the aggregation pipeline expect the xpia_ prefix established by the original schema.
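The prefix mismatch can be illustrated with a minimal sketch. The enum and the key-building helper below are hypothetical stand-ins for the SDK internals, not the actual code:

```python
from enum import Enum

# Hypothetical stand-in for the SDK's internal metrics enum: the
# service-facing value was changed to "indirect_attack" in v1.15.0.
class EvaluationMetrics(Enum):
    XPIA = "indirect_attack"

def output_key(metric_value: str, field: str) -> str:
    """Build an output column name by prefixing the field with the metric value."""
    return f"{metric_value}_{field}"

# Deriving the column name from the enum value produces the wrong prefix...
print(output_key(EvaluationMetrics.XPIA.value, "label"))  # indirect_attack_label

# ...while AI Foundry and the aggregation pipeline expect "xpia":
print(output_key("xpia", "label"))  # xpia_label
```

Because Foundry matches columns by the `xpia_` prefix, the `indirect_attack_*` columns were silently ignored even though the evaluation itself succeeded.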

Changes

  • _base_rai_svc_eval.py_parse_eval_result

    • Use "xpia" as the output key prefix for XPIA results regardless of EvaluationMetrics.XPIA.value ("indirect_attack")
    • Extract manipulated_content, intrusion, information_gathering as individual flat columns (xpia_manipulated_content, etc.) from scoreProperties, handling both snake_case and camelCase service response keys
    • Fallback: auto-rename any legacy indirect_attack_* keys to xpia_* for responses that already used the wrong prefix
  • _base_rai_svc_eval.py_evaluate_query_response

    • Pass metric_display_name="xpia" for XPIA so the legacy polling endpoint also produces correctly prefixed output keys via parse_response
  • _evaluate.py

    • _aggregate_label_defect_metrics: add "xpia" to handled_metrics so xpia_label columns are recognized and yield xpia_defect_rate aggregate metrics
    • _get_token_count_columns_to_exclude: add "xpia" to the known-metrics list so xpia_*_tokens columns are excluded from numeric aggregation
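The key normalization described in the `_parse_eval_result` bullets can be sketched roughly as follows. `normalize_xpia_keys` is a hypothetical simplification, not the SDK's actual helper:

```python
def normalize_xpia_keys(result: dict) -> dict:
    """Sketch of the normalization: rename legacy indirect_attack_* keys to
    xpia_* and flatten the three sub-metrics out of scoreProperties,
    accepting either snake_case or camelCase service response keys."""
    out = {}
    for key, value in result.items():
        if key.startswith("indirect_attack_"):
            key = "xpia_" + key[len("indirect_attack_"):]
        out[key] = value

    # Flatten sub-metrics into individual flat columns (xpia_manipulated_content, etc.).
    props = out.pop("scoreProperties", {}) or {}
    for snake, camel in [
        ("manipulated_content", "manipulatedContent"),
        ("intrusion", "intrusion"),
        ("information_gathering", "informationGathering"),
    ]:
        if snake in props:
            out[f"xpia_{snake}"] = props[snake]
        elif camel in props:
            out[f"xpia_{snake}"] = props[camel]
    return out

# Example: a response that used both the wrong prefix and camelCase keys.
raw = {
    "indirect_attack_label": False,
    "indirect_attack_reason": "No attack detected.",
    "scoreProperties": {"manipulatedContent": False, "intrusion": False},
}
print(normalize_xpia_keys(raw))
```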

Result: evaluate() with IndirectAttackEvaluator now produces the expected column names and metrics:

# Before fix:
result["rows"][0]["outputs.indirect_attack.indirect_attack_label"]  # wrong
result["metrics"]["indirect_attack.indirect_attack_defect_rate"]    # wrong

# After fix:
result["rows"][0]["outputs.indirect_attack.xpia_label"]                  # ✓
result["rows"][0]["outputs.indirect_attack.xpia_manipulated_content"]    # ✓
result["metrics"]["indirect_attack.xpia_defect_rate"]                    # ✓
result["metrics"]["indirect_attack.xpia_manipulated_content"]            # ✓
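The `xpia_defect_rate` aggregate produced by `_aggregate_label_defect_metrics` can be sketched as a mean over a boolean label column. This helper is a hypothetical simplification of the real aggregation:

```python
def label_defect_rate(rows: list, column: str):
    """Defect rate = fraction of rows whose boolean label is True,
    skipping rows where the label is missing."""
    labels = [row[column] for row in rows if row.get(column) is not None]
    if not labels:
        return None
    return sum(bool(v) for v in labels) / len(labels)

rows = [
    {"outputs.indirect_attack.xpia_label": True},
    {"outputs.indirect_attack.xpia_label": False},
    {"outputs.indirect_attack.xpia_label": False},
    {"outputs.indirect_attack.xpia_label": True},
]
print(label_defect_rate(rows, "outputs.indirect_attack.xpia_label"))  # 0.5
```

With the fix, the `xpia_label` column is recognized by the aggregation step, so this rate surfaces as `indirect_attack.xpia_defect_rate` in the metrics.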


Original prompt

This section details the original issue you should resolve

<issue_title>IndirectAttackEvaluator not uploading/displaying results in AI Foundry correctly</issue_title>
<issue_description>- Package Name: azure.ai.evaluation

  • Package Version: 1.15.3
  • Operating System: MacOS
  • Python Version: 3.12

Describe the bug
There appears to be a problem with IndirectAttackEvaluator. After data has been simulated with query/response pairs, and then passed/uploaded to AI Foundry, the evaluation results do not appear in the Foundry portal even though the results (returned back programmatically) prove that the evaluation ran correctly.

It's not clear whether the problem is with the SDK or Foundry. This is a blocker for all RAI evaluations that rely on indirect jailbreaking using the IndirectAttackEvaluator class.

To Reproduce

import os
from typing import Any, Dict, List, Optional

from azure.ai.evaluation import IndirectAttackEvaluator, evaluate
from azure.ai.evaluation.simulator import IndirectAttackSimulator
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI


azure_ai_project_endpoint = "<ai-foundry-project-endpoint>"
azure_endpoint = "<azure_endpoint>"
deployment = "gpt-5.1"
api_version = "2025-03-01-preview"

# sample application
def call_llm(
    query: str
) -> str:
    token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")
    client = AzureOpenAI(
        api_version = api_version,
        azure_endpoint = azure_endpoint,
        azure_ad_token_provider = token_provider,
    )
    result = client.responses.create(
        model = deployment,
        input = query,
    )
    return result.output_text

async def callback(
          messages: List[Dict],
          stream: bool = False,
          session_state: Any = None,
          context: Optional[dict[str, Any]] = None,
    ) -> dict:
    messages_list = messages["messages"]
    query = messages_list[-1]["content"]
    context = None

    # Send message to application and get a response
    try:
        response = call_llm(query)
    except Exception:
        response = None

    # Format response in OpenAI message protocol
    message = {"content": response, "role": "assistant", "context": context}
    messages["messages"].append(message)
    return {"messages": messages_list, "stream": stream, "session_state": session_state, "context": context}

# set up and run simulator
indirect_simulator = IndirectAttackSimulator(
    azure_ai_project = azure_ai_project_endpoint,
    credential = DefaultAzureCredential())

sim_results = await indirect_simulator(
    target=callback,
    max_conversation_turns=3,
    max_simulation_results=5,
)

# save simulated results to file
with open("indirect_jailbreak_example.jsonl", "w") as file:
    file.write(sim_results.to_eval_qr_json_lines())

# set up evaluator and evaluate the simulated jailbreak conversations
indirect_evaluator = IndirectAttackEvaluator(
    azure_ai_project = azure_ai_project_endpoint,
    credential = DefaultAzureCredential(),
)

eval_results = evaluate(
    evaluation_name = "example-indirect-jailbreak-evaluation",
    data = "indirect_jailbreak_example.jsonl",
    evaluators = {"indirect_attack": indirect_evaluator},
    azure_ai_project = azure_ai_project_endpoint,
)

Expected behavior
I expect to see the eval results/scores get reported and summarized correctly in Foundry. Currently no scores are recorded even though the object eval_results shows clear proof that the evaluator ran correctly.

After more testing, this class worked up through the v1.14.0 release. The problem began in the v1.15.0 release.

Screenshots

Image

Additional context
Add any other context about the problem here.
</issue_description>

Comments on the Issue (you are @copilot in this section)

@kashifkhan Thank you for the feedback and detailed repro. I'll forward this to the right team who can help you further.


…ry display

The IndirectAttackEvaluator was producing output keys with indirect_attack_ prefix
instead of the expected xpia_ prefix, causing results to not display in AI Foundry.

Root cause: EvaluationMetrics.XPIA.value was changed to indirect_attack for service
API calls, but output column names must use the xpia prefix for backward compatibility
with AI Foundry's expected column naming format.

Changes:
- _base_rai_svc_eval.py: Use xpia as output key prefix for XPIA results in
  _parse_eval_result, extract sub-metrics (manipulated_content, intrusion,
  information_gathering) as flat keys, add fallback to rename legacy
  indirect_attack_* keys to xpia_*
- _base_rai_svc_eval.py: Pass metric_display_name=xpia for legacy path
- _evaluate.py: Add xpia to handled_metrics in _aggregate_label_defect_metrics
  and to known metrics in _get_token_count_columns_to_exclude
- Add unit tests for XPIA-specific behavior
- Update CHANGELOG.md

Co-authored-by: needuv <40341266+needuv@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix IndirectAttackEvaluator results upload to AI Foundry Fix IndirectAttackEvaluator results not displaying in AI Foundry portal Mar 16, 2026
Copilot AI requested a review from needuv March 16, 2026 18:46
